From: Mark Birkin
Sent: 23 January 2007 13:07
To: Belinda Wu; Andy Turner; 'Paul Townend'; Jie Xu
Subject: Moses Technical Meeting

Dear All,

I think we have agreed to meet on Friday at 1pm. I assume the venue is the same as last time. I have three items to suggest for the agenda:

1. Moses as a Virtual Organisation. It might be appropriate for us to review the presentation from Steven Pickles at last week's e-infrastructure workshop. These should be on the web, but I can't find them. I have e-mailed Steven to ask for copies.

2. Responsibilities and interdependencies. There has been a flow of e-mail traffic regarding the data and model interfaces (see below). I'd like to review and check that we all understand who is doing what.

3. Use Case. The document from John England in Leeds Social Services (discussed at last week's meeting, but attached again if you have difficulty tracing it) provides us with a tangible set of user requirements for the portlet. I'd like to at least start a discussion of how we can meet these requirements.

In addition to this, there will be actions from the last meeting to report.

Since Paul chaired that last meeting, I'd like to propose that either Belinda or Andy leads this time. As a minimum, chair should be responsible for documenting action points from one meeting to the next. Since Belinda is busy with RSG this Friday, it may be best for Andy to chair this time around, and Belinda can go next?

Please circulate any comments or suggestions of other topics for discussion to the group.

Regards,
Mark

-------------------------------------------------------------
Mark Birkin
Centre for Spatial Analysis and Policy (CSAP)
School of Geography, University of Leeds, LS2 9JT, UK.
Email: m.h.birkin@leeds.ac.uk; Tel: +44 (0)113 34 36838
Homepage: http://www.geog.leeds.ac.uk/people/m.birkin/
CSAP: http://www.geog.leeds.ac.uk/research/asap/
-------------------------------------------------------------

Hi, Andy,

> For the data transfer we need to agree a format. Now, for the time being, I think it best that we pass a small ASCII CSV format file containing something along the lines of the following: AREA CODE, PERSON ID, HH ID, HSAR ID, ISAR ID, AGE, +?

As agreed at the last meeting, CSV will be fine. The question is: how does this proposal solve the problem that one variable is in fact still two? Correct me if I'm wrong, but I still see this only as a forward-compatibility issue from the ISAR and HSAR data to the Dynamic Model. The thing is: if we need to develop a model that includes the CEP, then we need to sort out this problem. As Jie and Paul have pointed out (and I agree), our model needs to produce reasonable results, so there is still a lot of important work ahead. Given the time scale and workload, we simply can't afford to spend time developing a lot of code only to throw it away again in the future. (The database implementation is not trivial, and I have already made the sacrifice of throwing one database implementation away.) I'd rather work on the refinement of the Migration or the Agent Based Modelling etc. to produce better results.

> I also propose to provide a JAR file and API with the necessary convenience classes and methods that allow for the manipulation of the data into a form you can readily load into the database/format into files used as input to the dynamic model.

Again, I do not object so much to using the JAR file and APIs if only minimal pre-processing is needed and I do not need to deal with the Grid. You and Paul both know that I was trained to program within an IDE under Windows only, and Java is a pretty safe language: the worst it can do is to bring the run-time environment down.
Even if you are willing to train me, we may not want me to crash something etc. and cause us a lot of grief later.

> Now, when it comes to interfacing/coupling on the portal side of things, it would be good if both Belinda and I provided to Paul using a common interface.

If memory serves, Paul wanted to obtain information from my database, so I am not sure what the common interface is. However, I'm open to suggestions.

Best,
Belinda

--------------------------------------------------------------------------------
From: Andy Turner
Sent: Thu 1/11/2007 1:34 PM
To: Belinda Wu
Cc: Mark Birkin; pt@comp.leeds.ac.uk; scsjx@leeds.ac.uk
Subject: RE: [MoSeS] Integration of demographic model

Hi,

Sorry Belinda, I'm probably using the wrong term with "integration"; perhaps "coupling" is better.

There are various options open to us. I propose that we wait until Jin is around and organise a meeting with him and perhaps Phil to find out what they have been doing vis "harmonizing" (coming up with a consistent set of variable definitions) for Household SAR (HSAR) and Individual SAR (ISAR) data records. We may or may not want to use their work...

In terms of the current population initialisation method (which is by no means ideal on reflection), I manipulated the HSAR records as follows. Upon loading the Household SAR, a random number (either 0 or 1) is added to the AGEH variable (using a fixed random number seed). This has various consequences, but it overcomes a problem: the need to aggregate into age groups spanning an odd number of years, as are in the ISAR and, more importantly, the Census Area Statistics (CAS) data (used for control constraint and optimisation in the population initialisation). (For example, Age 15 is an important age, but in the HSAR, we do not know if a record with AGEH=14 is a 14 year old or a 15 year old.)

When the data (results of population initialisation) are written out, the age used in the population initialisation is written out too.
This means it is possible to compare both what was used and AGEH (by retrieving it from the HSAR record using the HSAR ID). Thus it is possible to use either value, or another, for whatever age variable is used in the dynamic model.

Regardless of the other options I could have implemented to overcome the problem as outlined above, there remains an issue as to what we use in the dynamic model. It may even be that we want to use/incorporate derived data or processing methods from Jin and Phil's work.

Whatever the case, the coupling will involve some data transfer and possibly, in addition, some API.

For the data transfer we need to agree a format. Now, for the time being, I think it best that we pass a small ASCII CSV format file containing something along the lines of the following:

AREA CODE, PERSON ID, HH ID, HSAR ID, ISAR ID, AGE, +?

Any record will either have HSAR ID or ISAR ID equal to -9. (BTW: Currently for any area all Communal Establishment Populations will be lumped (there may be more than one Communal Establishment per area)). Records that represent individuals in Communal Establishments have a HH ID equal to -9. (This allows for us constructing households from ISAR records at a later date, which may be useful). The file passed should be compressed to a large degree.

I also propose to provide a JAR file and API with the necessary convenience classes and methods that allow for the manipulation of the data into a form you can readily load into the database/format into files used as input to the dynamic model. I plan to help Belinda write the method that does exactly what is needed now (in the hope that in the future it is easy for Belinda to do this should there be any changes).

One of the key advantages of doing the data transfer this way is that we can make the data and the API and source code open to all. Then if a researcher wants to recreate the population we use, all they need do is get a copy of the HSAR and ISAR and run the code.
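
The record layout and the age adjustment described above can be illustrated with a minimal sketch. It is not the project's code: the thread proposes a Java JAR, and Python is used here only for brevity. The field names, the helper functions, the sample rows and the seed value 2007 are all hypothetical; only the column order, the -9 sentinel and the "add 0 or 1 to AGEH with a fixed seed" rule come from the message.

```python
import csv
import io
import random

MISSING = -9  # sentinel value described in the thread

# Hypothetical names for the six agreed columns; the "+?" extras are not modelled.
FIELDS = ["area_code", "person_id", "hh_id", "hsar_id", "isar_id", "age"]


def parse_records(text):
    """Parse header-less CSV text into a list of dicts, one per person."""
    records = []
    for row in csv.reader(io.StringIO(text), skipinitialspace=True):
        if not row:  # skip blank lines
            continue
        if len(row) < len(FIELDS):
            raise ValueError(f"expected at least {len(FIELDS)} columns: {row}")
        rec = dict(zip(FIELDS, (value.strip() for value in row)))
        for key in FIELDS[1:]:  # every column except the area code is an integer
            rec[key] = int(rec[key])
        records.append(rec)
    return records


def source_of(rec):
    """Return 'HSAR' or 'ISAR' depending on which ID is not the -9 sentinel."""
    hsar_missing = rec["hsar_id"] == MISSING
    isar_missing = rec["isar_id"] == MISSING
    if hsar_missing == isar_missing:
        raise ValueError("exactly one of HSAR ID / ISAR ID should be -9")
    return "ISAR" if hsar_missing else "HSAR"


def in_communal_establishment(rec):
    """Communal Establishment residents are marked by HH ID == -9."""
    return rec["hh_id"] == MISSING


def perturb_age(ageh, rng):
    """Add 0 or 1 to a banded AGEH value using the supplied seeded generator."""
    return ageh + rng.randint(0, 1)


# Made-up sample rows, for illustration only.
SAMPLE = """A0001, 1, 10, 5001, -9, 34
A0001, 2, -9, -9, 7002, 71
"""

if __name__ == "__main__":
    rng = random.Random(2007)  # fixed seed, so a rerun reproduces the same ages
    for rec in parse_records(SAMPLE):
        print(rec["person_id"], source_of(rec), in_communal_establishment(rec))
    print(perturb_age(14, rng))  # either 14 or 15
```

Because the generator is seeded, rerunning the load gives the same adjusted ages, which is what makes it possible to recreate the population from the HSAR and ISAR alone.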

Now, when it comes to interfacing/coupling on the portal side of things, it would be good if both Belinda and I provided to Paul using a common interface. That way, to map 2001 should in theory be the same as to map 2031.

So Paul, is what I propose OK with you? I mean, if I gave you the same as Belinda, would you be happy with that?

Any other thoughts, comments, questions?

Best wishes,
Andy

A.G.D.Turner@leeds.ac.uk
http://www.geog.leeds.ac.uk/people/a.turner

-----Original Message-----
From: Belinda Wu
Sent: 10 January 2007 14:29
To: Andy Turner
Cc: Mark Birkin; pt@comp.leeds.ac.uk; scsjx@leeds.ac.uk
Subject: RE: [MoSeS] Integration of demographic model

Hi, Andy,

Sorry for the delay in replying. Your blog says "To discuss" the integration, and of course I'm open to discussion, as long as we are not to "make some progress on integrating our work so that the dynamic model can begin using outputs from my population initialisation rather than Mark's" before the 18th, :-)!

In fact I do not object to the idea of using your Dataloader etc., but we may view the baseline population production in a slightly different way. In theory, as I'm dealing with the Dynamic Simulation bit, the baseline data should need only minor manipulation before I can work with it, if I cannot use it straight away. In other words, I consider the need to integrate data from different sources and make them compatible etc. as part of baseline population creation, especially when we have agreed that I won't touch the Grid side of things. In my opinion, for example, we should not have two "Age" fields (one in 2-year bands and the other in 5-year bands), or different Social Economic Groupings etc., in the data you pass to me. Your current proposal asks the Dynamic Model to grab the raw data, so potentially I'll have to deal with the Grid as well as doing a lot of pre-processing.

Of course you may think differently, but let's talk! I'm sure that we'll reach an agreement.

Best,
Belinda

--------------------------------------------------------------------------------
From: Andy Turner
Sent: Tue 1/9/2007 4:15 PM
To: Belinda Wu
Subject: RE: [MoSeS] Integration of demographic model

Hi Belinda,

OK, no problem. Let's leave this for now. It is actioned against me:
http://www.geog.leeds.ac.uk/people/a.turner/projects/MoSeS/documentation/meetings/archive/2006-12-20/

We can leave it as on-going. All the best with the PhD upgrade preparation.

Bye for now,
Andy

A.G.D.Turner@leeds.ac.uk
http://www.geog.leeds.ac.uk/people/a.turner

-----Original Message-----
From: Belinda Wu
Sent: 09 January 2007 14:58
To: Andy Turner
Subject: RE: [MoSeS] Integration of demographic model

Hi, Andy,

Thanks, but it was not much of a break, since I'm working through it, preparing for my PhD upgrade meeting. I haven't read any minutes etc. from the last meeting. Since I confirmed my action points to everyone and there was no objection, I assume that there are no changes in the plan. That is, I'll try to:

1. deliver a new version of the model including CEP by the end of the month,
2. attempt some parallel programming to see how it goes.

The work has been up to my neck and I honestly cannot do more than that, given the time scale. However, if you want to discuss the integration again, we can certainly do that at the technical meeting on the 26th.

Best,
Belinda

--------------------------------------------------------------------------------
From: Andy Turner
Sent: Tue 1/9/2007 12:45 PM
To: Belinda Wu
Subject: [MoSeS] Integration of demographic model

Hi Belinda,

Hope you had a good break and are enjoying 2007 so far.

Before the meeting on the 18th we should make some progress on integrating our work so that the dynamic model can begin using outputs from my population initialisation rather than Mark's. There are various options open to us. To begin with we could ignore Communal Establishment Populations (CEPs) altogether.
However, when it comes to dealing with CEPs the dynamic model will be required to handle Individual SAR data. I outlined my preferred option for data interchange in the last meeting. This rather ignored the problem of how to come up with a common set of variables for the Household and Individual SARs. This is because we may not have to deal with this problem. You mentioned that an option could be to draw on Jin's work. I think Jin is away for a while still yet.

Any comments/questions?

Best wishes,
Andy

A.G.D.Turner@leeds.ac.uk
http://www.geog.leeds.ac.uk